ROCm and HIP: An In-Depth Tutorial in 10 Chapters: The Parallel Pivot: Mapping Sequential Logic to GPU Threads

The Parallel Pivot represents a fundamental shift in computing philosophy, from temporal sequencing (doing one thing after another) to spatial distribution (doing everything at once across a grid).

1. The Independence Heuristic

This is the golden rule of GPU computing: "Whenever your problem is 'apply something independently to N elements', this is the first mapping to try." This data-parallel approach is the low-hanging fruit of GPU acceleration, where the overhead of managing threads is outweighed by the massive simultaneous throughput.
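A quick way to test the heuristic on the host is to check that the order in which elements are visited does not change the result. The sketch below is plain C++, not HIP, and every name in it (`scale_and_shift`, `apply_sequential`, `apply_reversed`, `check_order_independence`) is a hypothetical helper chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-element operation: the result for element i depends
// only on the input at i, never on another element's result.
inline float scale_and_shift(float x) {
    return 2.0f * x + 1.0f;
}

// Sequential form: one worker visits every index in ascending order.
std::vector<float> apply_sequential(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        out[i] = scale_and_shift(in[i]);
    }
    return out;
}

// Same work, visited in descending order. If the operation is truly
// independent per element, the output is identical.
std::vector<float> apply_reversed(const std::vector<float>& in) {
    std::vector<float> out(in.size());
    for (std::size_t i = in.size(); i-- > 0;) {
        out[i] = scale_and_shift(in[i]);
    }
    return out;
}

// Aborts (in debug builds) if visiting order changes the result.
void check_order_independence(const std::vector<float>& in) {
    assert(apply_sequential(in) == apply_reversed(in));
}
```

If a loop passes this kind of check, each index can in principle be handed to its own GPU thread. A loop whose iteration reads the previous iteration's output (a running sum, for example) would fail it and needs a different mapping.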

2. Precision and Payload

HIP kernels typically operate on large arrays of primitive types. In high-performance graphics and machine learning, we often use float (single precision), while scientific simulations that demand extreme numerical stability use double (double precision).
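To make the precision trade-off concrete, the following host-side C++ sketch accumulates the same value many times in both types and measures how far each drifts from the exact product. It is only an illustration of rounding behavior, not a GPU kernel, and `repeated_sum` and `accumulation_error` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: add `step` to an accumulator `count` times,
// using whichever floating-point type T the caller selects.
template <typename T>
T repeated_sum(T step, std::size_t count) {
    T total = T(0);
    for (std::size_t k = 0; k < count; ++k) {
        total += step;
    }
    return total;
}

// Absolute error of the accumulated sum against step * count.
// The comparison is done in double so both types share one scale.
template <typename T>
double accumulation_error(double step, std::size_t count) {
    assert(count > 0);
    const double exact = step * static_cast<double>(count);
    const double got =
        static_cast<double>(repeated_sum<T>(static_cast<T>(step), count));
    return std::fabs(got - exact);
}
```

Calling `accumulation_error<float>(0.1, 1000000)` and `accumulation_error<double>(0.1, 1000000)` side by side should show a visibly larger drift for float. Writing the per-element work as a template over `T` is one way to keep a single code path for both precisions.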

3. From Iteration to Occupation

In CPU code, the processor "visits" the data through a loop. In GPU logic, the data "occupies" a thread. You stop writing how to loop and start writing what a single worker should do at a given coordinate.

$$\text{Index } i = \text{blockIdx.x} \times \text{blockDim.x} + \text{threadIdx.x}$$
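The index formula can be exercised without a GPU by enumerating every (block, thread) coordinate on the host. The sketch below is plain C++ that emulates a one-dimensional launch. The parameters `block_idx`, `block_dim` and `thread_idx` stand in for HIP's `blockIdx.x`, `blockDim.x` and `threadIdx.x`, and the function names (`blocks_for`, `square_one`, `emulate_launch`) are hypothetical. The `i < n` guard is an addition not spelled out in the text above: the grid is rounded up to whole blocks, so the last block may contain workers with no element to process.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Number of blocks needed to cover n elements (ceiling division).
inline std::size_t blocks_for(std::size_t n, std::size_t block_dim) {
    assert(block_dim > 0);
    return (n + block_dim - 1) / block_dim;
}

// Body of a hypothetical kernel: what ONE worker does at its coordinate.
inline void square_one(const float* in, float* out, std::size_t n,
                       std::size_t block_idx, std::size_t block_dim,
                       std::size_t thread_idx) {
    // Global index, computed exactly as in the formula above.
    const std::size_t i = block_idx * block_dim + thread_idx;
    if (i < n) {  // workers past the end of the array do nothing
        out[i] = in[i] * in[i];
    }
}

// Host-side stand-in for a kernel launch: walks every (block, thread)
// coordinate one after another. A GPU would run these concurrently.
std::vector<float> emulate_launch(const std::vector<float>& in,
                                  std::size_t block_dim) {
    std::vector<float> out(in.size(), 0.0f);
    const std::size_t grid_dim = blocks_for(in.size(), block_dim);
    for (std::size_t b = 0; b < grid_dim; ++b) {
        for (std::size_t t = 0; t < block_dim; ++t) {
            square_one(in.data(), out.data(), in.size(), b, block_dim, t);
        }
    }
    return out;
}
```

With five elements and a block size of four, `blocks_for` yields two blocks, so eight workers are enumerated and the last three fall through the bounds check. In a real HIP kernel the two nested loops disappear: only the body of `square_one` remains, and the launch configuration supplies the coordinates.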

QUESTION 1

What is the primary heuristic for deciding if a problem is suitable for the 'Parallel Pivot'?

The problem requires complex recursion.

The problem involves applying an operation independently to N elements.

The problem must be solved in a strict temporal order.

The problem uses only integer arithmetic.

QUESTION 2

In the context of the Parallel Pivot, what does the term 'Occupation' refer to?

The CPU visiting each index in a for-loop.

How many blocks are currently queued in the GPU.

Data 'occupying' a specific thread at a specific coordinate.

The percentage of memory used by the float arrays.

QUESTION 3

Which pair of data types do HIP kernels most commonly handle, with the second chosen when scientific work demands high numerical stability?

bool and char

int and long

float and double

void and pointer

QUESTION 4

When pivoting a loop into a kernel, what replaces the loop counter `i`?

The return value of the function.

A global thread identity calculated from grid/block dimensions.

The hipMalloc address.

The host-side iteration variable.

QUESTION 5

Fill in the blank: To ensure production reliability even in basic kernels, you must ______.

Only use float types.

Add explicit error-checking macros everywhere.

Use a single thread per block.

Avoid all boundary checks.